Update cuda_specs.py by asmertpc-cloud · Pull Request #1833 · bitsandbytes-foundation/bitsandbytes

asmertpc-cloud · 2025-12-22T05:37:50Z

fix for windows

oops

github-actions · 2026-01-07T16:11:01Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

TimDettmers

PR Review: #1833 — Update cuda_specs.py

Platform fix: attempts to make ROCm GPU detection work on Windows by switching rocminfo to hipinfo and adjusting warp-size regex parsing.

Blocking issues (2):

get_rocm_gpu_arch() will not work on Windows — This PR changes the subprocess command from rocminfo to hipinfo on Windows, but keeps the same regex pattern Name:\s+gfx(...). On Windows, hipinfo.exe reports the GPU architecture under gcnArchName:, not Name:. The regex will never match hipinfo output, so get_rocm_gpu_arch() will always return "unknown" on Windows. PR #1846 handles this correctly by using distinct regex patterns per platform (gcnArchName:\s+(gfx...) for Windows vs Name:\s+gfx(...) for Linux).
Superseded by PR #1846 — PR #1846 (by the same author) is a more comprehensive fix that addresses the same cuda_specs.py changes plus CMakeLists.txt build support, csrc/ops.cuh and csrc/ops_hip.cuh Windows portability fixes, and NOMINMAX defines. PR #1843 (by a different author) also addresses the same problem. This PR (#1833) is the least complete of the three and has correctness issues that #1846 does not. I recommend closing this PR in favor of #1846.

Additional issues:

Import ordering: import sys is placed after import torch, violating PEP 8 (stdlib imports before third-party). The linter would catch this.
Module-level side effect: The if sys.platform == "win32" block runs at module import time for all users, including non-ROCm users on Linux. While harmless (it just sets a string variable), it's cleaner to keep the platform check inside the functions that need it, as #1846 does.
Warp-size regex fragility: The combined regex (wavefront\s|warp)size:\s+([0-9]{2})(\([x0-9]{4}\))? is fragile. The [0-9]{2} requires exactly two digits, which works for 32 and 64 but fails for any other value. PR #1846's approach of using separate, explicit patterns per platform is more maintainable.
No tests — Bug fix PRs should include regression tests. However, testing subprocess-based hardware detection is difficult, so this is non-blocking.
CI has not run — No checks reported for this fork PR. A maintainer would need to approve the workflow run before merge.
Security: Clear (no new dangerous patterns; subprocess.run with hipinfo/rocminfo is the existing pattern)
Downstream impact: None (internal GPU detection, no public API changes)
Tests: Missing but non-blocking for hardware detection changes
CI: Not triggered (fork PR)
Cross-PR conflicts: Conflicts with #1846 and #1843 — all three modify bitsandbytes/cuda_specs.py for the same purpose. #1846 is the most complete and correct. Recommend closing #1833 and #1843 in favor of #1846.

Verdict: Request changes (close in favor of #1846). The get_rocm_gpu_arch() regex does not match hipinfo.exe output format, so this PR does not achieve its stated goal on Windows. PR #1846 from the same author fixes this correctly and includes the necessary build system changes.

TimDettmers · 2026-02-16T17:56:33Z

bitsandbytes/cuda_specs.py

+if (sys.platform == "win32"):
+    rocminfo = "hipinfo"
+else:
+    rocminfo = "rocminfo"


import sys should be placed before import torch (stdlib before third-party per PEP 8). Also, this module-level platform check runs for all users on import. Consider keeping the platform-conditional logic inside the functions that need it, as PR #1846 does with platform.system() == "Windows" checks.

TimDettmers · 2026-02-16T17:56:33Z

bitsandbytes/cuda_specs.py

-            result = subprocess.run(["rocminfo"], capture_output=True, text=True)
-            match = re.search(r"Name:\s+gfx([a-zA-Z\d]+)", result.stdout)
+            result = subprocess.run([rocminfo], capture_output=True, text=True)
+            match = re.search(r"Name:\s+gfx([a-z\d]+)", result.stdout, re.IGNORECASE)


This regex Name:\s+gfx(...) will not match hipinfo.exe output on Windows. The hipinfo.exe utility reports GPU architecture under gcnArchName:, not Name:. For example, hipinfo outputs gcnArchName: gfx1100, not Name: gfx1100. This means get_rocm_gpu_arch() will always return "unknown" on Windows, defeating the purpose of this PR. PR #1846 handles this correctly with a separate gcnArchName: pattern for Windows.

TimDettmers · 2026-02-16T17:56:33Z

bitsandbytes/cuda_specs.py

-            result = subprocess.run(["rocminfo"], capture_output=True, text=True)
-            match = re.search(r"Wavefront Size:\s+([0-9]{2})\(0x[0-9]{2}\)", result.stdout)
+            result = subprocess.run([rocminfo], capture_output=True, text=True)
+            match = re.search(r"(wavefront\s|warp)size:\s+([0-9]{2})(\([x0-9]{4}\))?", result.stdout, re.IGNORECASE)


The combined regex (wavefront\s|warp)size: is creative but fragile. The [0-9]{2} in group 2 requires exactly two digits, which works for 32/64 but would break for unexpected values. More importantly, the third optional group (\([x0-9]{4}\))? requires exactly 4 characters inside parentheses — this happens to match (0x40) for warp size 64 but would fail for other hex representations. PR #1846's approach of using cleanly separated patterns (warpSize:\s+(\d+) for Windows, the existing Wavefront Size: pattern for Linux) is more robust and readable.

asmertpc-cloud added 3 commits December 22, 2025 08:36

Update cuda_specs.py

05e5a71

fix for windows

Update cuda_specs.py

3364ae3

Update cuda_specs.py

b700ae2

oops

matthewdouglas added Windows ROCm labels Jan 7, 2026

matthewdouglas added this to the v0.50.0 milestone Jan 7, 2026

asmertpc-cloud added 2 commits January 11, 2026 06:02

Merge branch 'main' into patch-1

3eca466

Update cuda_specs.py

653f673

This was referenced Feb 16, 2026

Adding support for building for AMD on Windows #1846

Open

fix ROCm GPU architecture detection failed on windows #1843

Open

TimDettmers reviewed Feb 16, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Update cuda_specs.py#1833

Update cuda_specs.py#1833
asmertpc-cloud wants to merge 5 commits intobitsandbytes-foundation:mainfrom
asmertpc-cloud:patch-1

asmertpc-cloud commented Dec 22, 2025

Uh oh!

github-actions bot commented Jan 7, 2026

Uh oh!

TimDettmers left a comment

Uh oh!

TimDettmers Feb 16, 2026

Uh oh!

TimDettmers Feb 16, 2026

Uh oh!

TimDettmers Feb 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

asmertpc-cloud commented Dec 22, 2025

Uh oh!

github-actions bot commented Jan 7, 2026

Uh oh!

TimDettmers left a comment

Choose a reason for hiding this comment

PR Review: #1833 — Update cuda_specs.py

Uh oh!

TimDettmers Feb 16, 2026

Choose a reason for hiding this comment

Uh oh!

TimDettmers Feb 16, 2026

Choose a reason for hiding this comment

Uh oh!

TimDettmers Feb 16, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants